Let's assume the monthly growth rate follows the following distribution:
What is the distribution of heights of 10000 children at different ages?
This is an example that illustrates the central limit theorem, which states that the outcomes of processes that arise as the sum of many small, independent and identically distributed events are approximately normally distributed.
One way to explain why this is the case is to see that there are more possible combinations of events that lead to average outcomes than possible combinations of events that lead to extreme outcomes.
For instance, assume that you toss a fair coin four times; each time heads shows you receive one credit point, and each time tails shows you lose one credit point. The next table shows that more possible sequences lead to an end result of 0 credit points (6 of 16) than to an end result of 4 credit points (1 of 16).
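The idea can be sketched with a small simulation in R. The growth distribution used here (a skewed gamma distribution with a mean of 1 per month), the starting length of 50, the 36-month horizon, and the seed are all assumptions chosen for illustration; they are not necessarily the values behind the distribution referred to above.

```r
set.seed(123)                      # assumed seed, only for reproducibility
n_children <- 10000
n_months   <- 36                   # assumed time horizon
birth_len  <- 50                   # assumed starting length

# Assumed monthly growth: skewed gamma distribution, mean = shape / rate = 1
growth <- matrix(rgamma(n_children * n_months, shape = 2, rate = 2),
                 nrow = n_children, ncol = n_months)

# Height after m months = starting length + sum of the first m increments
height_at <- function(m) birth_len + rowSums(growth[, 1:m, drop = FALSE])

# Histograms of height at several ages
par(mfrow = c(1, 3))
for (m in c(1, 6, 36)) {
  hist(height_at(m), breaks = 50, main = paste(m, "months"), xlab = "height")
}
```

Under these assumptions, the histogram after one month should still show the skew of the monthly increments, while the histogram after 36 months should look close to a normal distribution.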
| Permutation | event 1 | event 2 | event 3 | event 4 | sum |
|---|---|---|---|---|---|
| 1 | -1 | -1 | -1 | -1 | -4 |
| 2 | 1 | -1 | -1 | -1 | -2 |
| 3 | -1 | 1 | -1 | -1 | -2 |
| 4 | 1 | 1 | -1 | -1 | 0 |
| 5 | -1 | -1 | 1 | -1 | -2 |
| 6 | 1 | -1 | 1 | -1 | 0 |
| 7 | -1 | 1 | 1 | -1 | 0 |
| 8 | 1 | 1 | 1 | -1 | 2 |
| 9 | -1 | -1 | -1 | 1 | -2 |
| 10 | 1 | -1 | -1 | 1 | 0 |
| 11 | -1 | 1 | -1 | 1 | 0 |
| 12 | 1 | 1 | -1 | 1 | 2 |
| 13 | -1 | -1 | 1 | 1 | 0 |
| 14 | 1 | -1 | 1 | 1 | 2 |
| 15 | -1 | 1 | 1 | 1 | 2 |
| 16 | 1 | 1 | 1 | 1 | 4 |
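The table can be reproduced and summarized with a few lines of R. This is a minimal sketch; the column names `event1` to `event4` are illustrative choices.

```r
# All 2^4 sequences of wins (+1) and losses (-1)
outcomes <- expand.grid(event1 = c(-1, 1), event2 = c(-1, 1),
                        event3 = c(-1, 1), event4 = c(-1, 1))
outcomes$sum <- rowSums(outcomes)

# Number of sequences that lead to each end result
counts <- table(outcomes$sum)
print(counts)
```

Counting the rows per end result gives 1, 4, 6, 4 and 1 sequences for sums of -4, -2, 0, 2 and 4, so the average outcome 0 is six times as likely as either extreme.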
Now let's do the same experiment again, except that we look at 16 tosses instead of 4, which leads to \(2^{16} = 65536\) possible sequences. Here is the distribution of credit points.
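With 16 tosses it is no longer practical to list every sequence, but the number of sequences per end result can be computed directly: a sequence with \(k\) heads ends at \(2k - 16\) credit points, and there are \(\binom{16}{k}\) such sequences. A short sketch in R (variable names are illustrative):

```r
n_tosses <- 16
heads    <- 0:n_tosses                 # possible numbers of heads
points   <- 2 * heads - n_tosses       # heads win 1 point, tails lose 1
n_seq    <- choose(n_tosses, heads)    # sequences with that many heads

sum(n_seq)                             # total number of sequences, 2^16

# Proportion of sequences per end result
barplot(n_seq / sum(n_seq), names.arg = points,
        xlab = "credit points", ylab = "proportion of sequences")
```

Of the 65536 sequences, 12870 end at 0 credit points, while only one ends at 16.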
One popular device to display such a process is a Galton board:
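A Galton board can also be simulated. In the sketch below, each ball bounces left or right with equal probability at every row of pegs, and its final bin is the number of bounces to the right. The number of balls, the number of rows, and the seed are assumptions chosen for illustration.

```r
set.seed(1)        # assumed seed, only for reproducibility
n_balls <- 5000    # assumed number of balls
n_rows  <- 12      # assumed number of peg rows

# One row per ball, one column per peg row: 0 = left, 1 = right
bounces <- matrix(rbinom(n_balls * n_rows, size = 1, prob = 0.5),
                  nrow = n_balls, ncol = n_rows)

# Final bin of each ball = number of bounces to the right
bin <- rowSums(bounces)

# Number of balls per bin, including empty bins
barplot(table(factor(bin, levels = 0:n_rows)),
        xlab = "bin", ylab = "number of balls")
```

Most balls should end up in the middle bins, because many more left/right sequences lead there than to the outermost bins.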
What is the association between length and weight at birth?
We simulate some data:
dt <- data.frame(length = rnorm(250, 50, 5))
expected_weight <- 3.5 + as.numeric(scale(dt$length)) * .5  # scale() returns a matrix, so convert to a plain vector
dt$weight <- rnorm(250, expected_weight, .5)
When data covary, we can look at, for example, a scatter plot, which shows the joint distribution, to see how the variables are related.
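A minimal sketch of such a scatter plot in R. The block regenerates the simulated data so that it runs on its own; the seed, the added regression line, and the correlation coefficient are illustrative additions.

```r
set.seed(42)  # assumed seed, only for reproducibility

# Same simulation as above: weight depends on standardized length plus noise
dt <- data.frame(length = rnorm(250, 50, 5))
dt$weight <- rnorm(250, 3.5 + as.numeric(scale(dt$length)) * .5, .5)

# Scatter plot of the joint distribution
plot(dt$length, dt$weight, xlab = "length", ylab = "weight")

# Linear fit and correlation as simple summaries of the association
fit <- lm(weight ~ length, data = dt)
abline(fit)
r <- cor(dt$length, dt$weight)
```

Because the signal and the noise have the same standard deviation (0.5) in this simulation, the correlation `r` should be around 0.7.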